Causal Inference 1

Digging Deeper

Jeremy Springman

University of Pennsylvania

Global Development: Intermediate Topics in Politics, Policy, and Data

PSCI 3200 - Spring 2024

Logistics

Assignments

  • If you didn’t post your git handle on Slack, please do
  • Next week:
    • Readings
    • Install Quarto and create an empty PDF
    • I’ll create a page before Saturday

Agenda

  1. How to follow along with the slides
  2. Estimating Causal Effects with Randomized Experiments
  3. Economists as Plumbers

Estimating Causal Effects with Randomized Experiments

How do we think about causality?

Potential Outcomes

\[ Y_i = \begin{cases} Y_i(1) & \text{if } D_i = 1 \text{ (treatment group)} \\ Y_i(0) & \text{if } D_i = 0 \text{ (control group)} \end{cases} \]

Treatment Effect for individual \(i\)

\[ TE_i = Y_i(1) - Y_i(0) \]

What’s the problem?


Fundamental Problem of Causal Inference

  • We observe any given unit in only one treatment status at a time, so we can never directly observe the causal effect of a treatment on an individual unit
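
A small simulation can make the problem concrete. The sketch below invents both potential outcomes for six hypothetical units (possible only in a simulation), assigns treatment at random, and then masks the outcome we would not get to see. The sample size, outcome values, and the constant effect of 4 are all assumptions chosen for illustration.

```r
set.seed(3200)  # arbitrary seed so the example is reproducible

n  <- 6
Y0 <- round(rnorm(n, mean = 50, sd = 5))  # hypothetical outcome without treatment
Y1 <- Y0 + 4                              # hypothetical outcome with treatment (TE_i = 4 by construction)
D  <- sample(rep(c(0, 1), each = n / 2))  # random assignment: 3 treated, 3 control

units <- data.frame(
  unit        = 1:n,
  D           = D,
  Y0_observed = ifelse(D == 0, Y0, NA),   # Y_i(0) is only seen for control units
  Y1_observed = ifelse(D == 1, Y1, NA),   # Y_i(1) is only seen for treated units
  Y           = ifelse(D == 1, Y1, Y0)    # the outcome we actually record
)
print(units)

# The individual effect needs both columns, so it is missing for every unit
TE_observed <- units$Y1_observed - units$Y0_observed
print(TE_observed)
```

Every row has an `NA` in one of the two potential-outcome columns, which is why `TE_observed` cannot be computed for any single unit.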

What’s the solution?

Counterfactuals

  • Individuals in the control group serve as a stand-in for the counterfactual of the treatment group

\[ \widehat{ATE} = \overline{Y}_{\text{treatment group}} - \overline{Y}_{\text{control group}} \]
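
To see the estimator at work, the sketch below simulates a hypothetical experiment in which the true average effect is set close to 2, assigns treatment by coin flip, and computes the difference in group means. The sample size, outcome distribution, and effect size are assumptions chosen only for illustration.

```r
set.seed(2024)

n  <- 10000
Y0 <- rnorm(n, mean = 10, sd = 3)       # potential outcome under control
Y1 <- Y0 + rnorm(n, mean = 2, sd = 1)   # potential outcome under treatment
D  <- rbinom(n, size = 1, prob = 0.5)   # coin-flip assignment
Y  <- ifelse(D == 1, Y1, Y0)            # observed outcome

true_ate <- mean(Y1 - Y0)               # knowable only inside a simulation
ate_hat  <- mean(Y[D == 1]) - mean(Y[D == 0])

print(c(true_ate = true_ate, ate_hat = ate_hat))
```

With random assignment and a large sample, the two numbers should be close, even though no individual treatment effect was ever observed.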

What’s complicated about this?

  • “Only valid when the treatment and control group are comparable with respect to all the variables that might affect the outcome other than the treatment variable itself.”
  • “We must find or create a situation in which the treated observations and the untreated observations are similar with respect to all the variables that might affect the outcome”
  • “By randomly assigning treatment, we ensure that treatment and control groups are, on average, identical to each other in all observed and unobserved pre-treatment characteristics”
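
The contrast between self-selection and random assignment can be sketched in a simulation. Here a hypothetical pre-treatment covariate (labeled `income`) raises the outcome and, in the observational scenario, also makes units more likely to take the treatment. The true effect is fixed at 1 for every unit; all variable names and parameter values are assumptions for illustration.

```r
set.seed(1)

n      <- 20000
income <- rnorm(n)                    # hypothetical pre-treatment covariate
Y0     <- 5 + 2 * income + rnorm(n)   # outcome without treatment depends on income
Y1     <- Y0 + 1                      # true treatment effect is 1 for every unit

# Scenario A: units with higher income are more likely to select into treatment
D_selected <- rbinom(n, 1, plogis(2 * income))
# Scenario B: treatment assigned by coin flip
D_random   <- rbinom(n, 1, 0.5)

# Difference in mean observed outcomes between treated and untreated units
diff_in_means <- function(D) {
  Y <- ifelse(D == 1, Y1, Y0)
  mean(Y[D == 1]) - mean(Y[D == 0])
}
# Difference in mean income between treated and untreated units
covariate_gap <- function(D) mean(income[D == 1]) - mean(income[D == 0])

results <- data.frame(
  scenario      = c("self-selected", "randomized"),
  income_gap    = c(covariate_gap(D_selected), covariate_gap(D_random)),
  estimated_ate = c(diff_in_means(D_selected), diff_in_means(D_random))
)
print(results)
```

In the randomized scenario the income gap should be near zero and the estimate near 1; in the self-selected scenario the estimate absorbs the income difference between groups and overstates the effect.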

Can’t we just observe and compare?

Example: What is the effect of class size on test scores?

Why can’t we just observe how individuals change over time?

library(ggplot2)

Year = c(0,1,2,3)
Outcome = c(NA, 1.3, 1.7,NA)
Treatment = c("Control", "Control","Control","Control")

dat = data.frame(Year, Outcome, Treatment)

ggplot(data = dat, aes(x = Year, y = Outcome, group = Treatment, color = Treatment)) +
  geom_line(aes(linetype = Treatment), linewidth = 2) +
  geom_point(size = 6) +
  scale_linetype_manual(values=c("solid")) +
  xlim(0,3) + 
  scale_y_continuous(limits = c(1,1.85), breaks = seq(1, 1.85, by = .1)) + 
  scale_color_manual(values = c("blue") ) +
  theme(legend.position = "none", text = element_text(size=20)) 

Why can’t we just observe how individuals change over time?

Year = c(0,1,2,3)
Outcome = c(0.9, 1.3, 1.7, 2.1)
Treatment = c("Control", "Control","Control","Control")

dat = data.frame(Year, Outcome, Treatment)

ggplot(data = dat, aes(x = Year, y = Outcome, group = Treatment, color = Treatment)) +
  geom_line(aes(linetype = Treatment), linewidth = 2) +
  geom_point(size = 6) +
  xlim(0,3) + 
  scale_y_continuous(breaks = seq(1, 1.85, by = .1)) + 
  scale_linetype_manual(values=c("solid", "solid")) +
  scale_color_manual(values = c("blue") ) +
  coord_cartesian(ylim = c(1, 1.85), clip = "on") +
  theme(legend.position = "none", text = element_text(size=20))

Why can’t we just observe how individuals change over time?

Year = rep(c(0, 1, 2, 3), times = 2)  # repeat the years once per group
Outcome = c(NA, 1.2, 1.4, NA, 
            NA, 1.3, 1.7, NA)
Treatment = c("Control", "Control","Control","Control", 
              "Treatment", "Treatment", "Treatment", "Treatment")

dat = data.frame(Year, Outcome, Treatment)

ggplot(data = dat, aes(x = Year, y = Outcome, group = Treatment, color = Treatment)) +
  geom_line(aes(linetype = Treatment), linewidth = 2) +
  geom_point(size = 6) +
  xlim(0,3) + 
  scale_y_continuous(limits = c(1,1.85), breaks = seq(1, 1.85, by = .1)) + 
  scale_linetype_manual(values=c("solid", "solid")) +
  scale_color_manual(values = c("red", "blue") ) +
  theme(legend.position = c(0.8, 0.2), text = element_text(size=20))

Why can’t we just observe how individuals change over time?

Year = rep(c(0, 1, 2, 3), times = 2)  # repeat the years once per group
Outcome = c(1, 1.2, 1.4, 1.6, 
            0.9, 1.3, 1.7, 2.1)
Treatment = c("Control", "Control","Control","Control", 
              "Treatment", "Treatment", "Treatment", "Treatment")

dat = data.frame(Year, Outcome, Treatment)


ggplot(data = dat, aes(x = Year, y = Outcome, group = Treatment, color = Treatment)) +
  geom_line(aes(linetype = Treatment), linewidth = 2) +
  geom_point(size = 6) +
  xlim(0,3) + 
  scale_y_continuous(breaks = seq(1, 1.85, by = .1)) + 
  scale_linetype_manual(values=c("solid", "solid")) +
  scale_color_manual(values = c("red", "blue") ) +
  coord_cartesian(ylim = c(1, 1.85), clip = "on") +
  theme(legend.position = c(0.8, 0.2), text = element_text(size=20))

Why can’t we just observe how individuals change over time?

Year = rep(c(0, 1, 2, 3), times = 3)  # repeat the years once per group
Outcome = c(NA, 1.2, 1.4, NA, 
            NA, 1.3, 1.7, NA, 
            NA, 1.3, 1.5, NA)
Treatment = c("Control", "Control","Control","Control", 
              "Treatment", "Treatment", "Treatment", "Treatment",
              "Comparison","Comparison","Comparison","Comparison")

dat = data.frame(Year, Outcome, Treatment)


ggplot(data = dat, aes(x = Year, y = Outcome, group = Treatment, color = Treatment)) +
  geom_line(aes(linetype = Treatment), linewidth = 2) +
  geom_point(size = 6) +
  xlim(0,3) + 
  scale_y_continuous(limits = c(1,1.85), breaks = seq(1, 1.85, by = .1)) + 
  scale_linetype_manual(values=c("dotted", "solid", "solid")) +
  scale_color_manual(values = c("black", "red", "blue") ) +
  theme(legend.position = c(0.8, 0.2), text = element_text(size=20))
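
The arithmetic behind the figure can be spelled out using the same values plotted above. Treating the dotted "Comparison" line as the path the treatment group would have followed without treatment is one interpretation of the plot, not something the slide states explicitly; the variable names are illustrative.

```r
# Values at Year 1 and Year 2, copied from the plotting code above
control    <- c(before = 1.2, after = 1.4)
treatment  <- c(before = 1.3, after = 1.7)
comparison <- c(before = 1.3, after = 1.5)  # assumed counterfactual path for the treated group

# Naive before-after change in the treatment group
before_after <- unname(treatment["after"] - treatment["before"])

# Change the control group experienced over the same period
background_trend <- unname(control["after"] - control["before"])

# Gap between the treated outcome and its assumed counterfactual at Year 2
effect_vs_counterfactual <- unname(treatment["after"] - comparison["after"])

print(round(c(before_after = before_after,
              background_trend = background_trend,
              effect_vs_counterfactual = effect_vs_counterfactual), 2))
```

Under this reading, half of the 0.4 before-after change in the treatment group reflects a trend that the control group also experienced, so watching the treated units change over time overstates the effect.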

Dealing with Small Sample Sizes

  • Re-randomization
  • Blocking
  • Non-bipartite matching
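
As one illustration of the second option, the sketch below performs blocked random assignment in base R: units are grouped on a hypothetical pre-treatment covariate (`region`), and exactly half of each block is assigned to treatment. The data, the blocking variable, and the helper name `assign_within_block` are all assumptions for illustration, not part of the course materials.

```r
set.seed(42)

# Hypothetical sample of 12 units in three blocks of four
units <- data.frame(
  id     = 1:12,
  region = rep(c("North", "Central", "South"), each = 4)
)

# Shuffle a vector holding an equal number of 0s and 1s (for even block sizes)
assign_within_block <- function(n_block) {
  sample(rep(c(0, 1), length.out = n_block))
}

# ave() applies the function separately within each region
units$D <- ave(units$id, units$region,
               FUN = function(ids) assign_within_block(length(ids)))

# Each region should show two control (0) and two treated (1) units
print(table(units$region, units$D))
```

Because assignment happens within each block, the treatment and control groups are balanced on `region` by construction rather than only on average, which matters most when the sample is small.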

Counterpoint

Economists as Plumbers

Duflo: big picture + details

Why focus on details:

  1. Policymakers don’t have time to focus on details
  2. Details make all of the difference

Example of water connections:

  • Big picture: people really need water
  • Details: they won’t navigate bureaucracy

  1. Design of the tap: details about communication or defaults
  2. Layout of the pipes: logistics of authority and responsibility

Plumbing vs Science

  • “It turns out that most policymakers, and most bureaucrats, are not very good plumbers.”
  • “And not all plumbers need to be economists. Sometimes, what a policymaker needs is a good software engineer, lawyer, or subject expert”
  • “To summarize, economists have the disciplinary training to make good plumbers: economics trains us in behavioral science, incentives issues, and firm behavior; it also gives us an understanding of both governments and firms as organizations, though more work probably remains to be done there. We economists are even equipped to think about market equilibrium consequences of apparently small changes. This comparative advantage, along with the importance of getting these issues right, makes it a responsibility for our profession to engage with the world on those terms.”
  • “Plumbing experiments, since they are primarily motivated by pragmatism, must focus on what is important for the world, not necessarily on the very subtle issues that theorists would find worth discussing.”
  • “Scientists design general frames, engineers turn them into relevant machinery, and plumbers finally make them work in a complicated, messy policy environment.”

Any normatively uncomfortable findings?

M2: What do we mean by ‘development’ and why do we study it?

  • What causes development, and can we cause it to happen? Probably not really.
  • So what is the role of social scientists? Duflo: help at the margins. Answer questions like:
    • Basic research: what do people want/need? What is preventing that from happening?
    • Applied research: do specific things work to resolve these issues?
    • “There may not be a theory fully worked out to accommodate these features, but she can use computation and lab experiments to simulate how they will play out”
    • “The uncertainty in the environment creates a highly stochastic world: the natural way to “pay attention” to what happens, as I will argue below, is thus to analyze natural experiments or set up field experiments to try out different plumbing possibilities.”